Abstract

We study the moments and the distribution of the discrete Choquet integral when 
regarded as a real function of a random sample drawn from a continuous distri- 
bution. Since the discrete Choquet integral includes weighted arithmetic means, 
ordered weighted averaging functions, and lattice polynomial functions as partic- 
ular cases, our results encompass the corresponding results for these aggregation 
functions. After detailing the results obtained in [1] in the uniform case, we present 
results for the standard exponential case, show how approximations of the moments 
can be obtained for other continuous distributions such as the standard normal, 
and elaborate on the asymptotic distribution of the Choquet integral. The results 
presented in this work can be used to improve the interpretation of discrete Choquet 
integrals when employed as aggregation functions. 

1 Introduction

Aggregation functions are of central importance in many fields such as
statistics, information fusion, risk analysis, or decision theory. In this
paper, the primary object of interest is a natural extension of the weighted
arithmetic mean known as the (discrete) Choquet integral [2-4]. Also known in
discrete mathematics as the Lovász extension of pseudo-Boolean functions [5],
the Choquet integral is a very flexible aggregation function that includes
weighted arithmetic means, ordered weighted averaging functions [6], and
lattice polynomial functions as special cases [7,1].

Although the Choquet integral has been extensively employed as an aggrega- 
tion function (see e.g. [8] for an overview), its moments and its distribution 
seem to have never been thoroughly studied from a theoretical perspective. 
The aim of this work is to attempt to fill this gap in the case when the Cho- 
quet integral is regarded as a real function of a random sample drawn from a 
continuous distribution. 

The starting point of our study is a natural distributional relationship be- 
tween linear combinations of order statistics and the Choquet integral, which 
merely results from the piecewise linear decomposition of the latter. As a 
consequence, exact formulations of the moments and the distribution of the 
Choquet integral can be provided whenever exact formulations are known for 
linear combinations of order statistics. Likewise, approximation and asymp- 
totic results can be provided whenever available for linear combinations of 
order statistics. 

The paper is organized as follows. In the second section, we recall the definition 
of the discrete Choquet integral. The third section is devoted to the expression 
of the distribution (resp. the moments) of the Choquet integral in terms of the 
distribution (resp. the moments) of linear combinations of order statistics. The 
case of standard uniform input variables is treated in Section 4. More precisely, 
the results obtained in [1] are detailed, and algorithms for computing the 
probability density function (p.d.f.) and the cumulative distribution function 
(c.d.f.) of the Choquet integral are provided. The fifth section deals with the 
standard exponential case, while the sixth one shows how approximations
of moments can be obtained for other continuous distributions such as the
standard normal. In the last section, we discuss conditions under which the
asymptotic distribution of the Choquet integral is a mixture of normals. 

The results obtained in this work have numerous applications. The most im- 
mediate ones are related to the interpretation of the Choquet integral when 
seen as an aggregation function. In multicriteria decision aiding in particular, 
the presented results can be used to generalize the behavioral indices studied 
e.g. in [9,10]. In classifier fusion, they can enable a theoretical study of the 
so-called fuzzy approach to classifier combination (see e.g. [11]) in the spirit of 
that done in [12]. 

Note that most of the methods and algorithms discussed in this work have been
implemented in the R package kappalab [13] available on the Comprehensive
R Archive Network (http://CRAN.R-project.org).



2 The discrete Choquet integral 

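Define $N := \{1, \dots, n\}$ as a set of attributes, criteria, or players, and
denote by $\mathfrak{S}_n$ the set of permutations on $N$. A set function
$\nu : 2^N \to [0, 1]$ is said to be a game on $N$ if it satisfies
$\nu(\emptyset) = 0$.

Definition 1 The discrete Choquet integral of $x \in \mathbb{R}^n$ w.r.t. a game
$\nu$ on $N$ is defined by

$$C_\nu(x) := \sum_{i=1}^n p_i^{\nu,\sigma} \, x_{\sigma(i)},$$

where $\sigma \in \mathfrak{S}_n$ is such that
$x_{\sigma(1)} \ge \dots \ge x_{\sigma(n)}$, where

$$p_i^{\nu,\sigma} := \nu_i^\sigma - \nu_{i-1}^\sigma, \qquad i \in N,$$

and where $\nu_i^\sigma := \nu(\{\sigma(1), \dots, \sigma(i)\})$ for any
$i = 0, \dots, n$. In particular, $\nu_0^\sigma := 0$.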

Note that the permutation $\sigma$ in the definition of the Choquet integral of
$x$ is traditionally taken such that $x_{\sigma(1)} \le \dots \le x_{\sigma(n)}$.
We do not adopt this convention in this work because it would have led to much
more complicated expressions of the results to be presented in Section 4.
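Definition 1 translates almost line by line into code. The following is a minimal sketch, not the kappalab implementation: it assumes the game is stored as a Python dict keyed by `frozenset` subsets of {1, ..., n}, and the names `choquet` and `additive_capacity`, as well as the numerical data, are illustrative choices made here.

```python
from itertools import combinations


def choquet(x, nu):
    """Discrete Choquet integral of x w.r.t. the game nu (Definition 1).

    x  : sequence of n reals; x[k-1] is the value of criterion k.
    nu : dict mapping frozenset subsets of {1, ..., n} to reals,
         with nu[frozenset()] == 0 (assumed representation).
    """
    n = len(x)
    # sigma orders the criteria by decreasing value: x_sigma(1) >= ... >= x_sigma(n)
    sigma = sorted(range(1, n + 1), key=lambda k: x[k - 1], reverse=True)
    total, previous = 0.0, 0.0            # previous holds nu_{i-1}^sigma (nu_0^sigma = 0)
    for i in range(1, n + 1):
        current = nu[frozenset(sigma[:i])]                   # nu_i^sigma
        total += (current - previous) * x[sigma[i - 1] - 1]  # p_i^{nu,sigma} * x_sigma(i)
        previous = current
    return total


def additive_capacity(weights):
    """Hypothetical helper: additive capacity nu(T) = sum of the weights over T."""
    n = len(weights)
    return {frozenset(c): sum(weights[k - 1] for k in c)
            for size in range(n + 1)
            for c in combinations(range(1, n + 1), size)}


# Made-up data: with an additive capacity the integral is a weighted mean.
weights = [0.2, 0.5, 0.3]
scores = [0.9, 0.4, 0.6]
value = choquet(scores, additive_capacity(weights))

# A capacity that is 0 everywhere except on N gives the minimum.
nu_min = {t: (1.0 if len(t) == 3 else 0.0) for t in additive_capacity(weights)}
lowest = choquet(scores, nu_min)
```

Ties among the components of `x` are broken arbitrarily by the sort; since the Choquet integral is continuous across the boundaries of the ordering regions, the result does not depend on that choice.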

From the above definition, we see that the Choquet integral is a piecewise
linear function that coincides with a weighted sum on each $n$-dimensional
polyhedron

$$R_\sigma := \{x \in \mathbb{R}^n \mid x_{\sigma(1)} \ge \dots \ge x_{\sigma(n)}\}, \qquad \sigma \in \mathfrak{S}_n, \qquad (1)$$

whose union covers $\mathbb{R}^n$. It can additionally be immediately verified
that it is a continuous function.

When defined as above, the Choquet integral coincides with the Lovász
extension [5] of the unique pseudo-Boolean function that can be associated with
$\nu$ [14] and can be alternatively regarded as a linear combination of lattice
polynomial functions (see e.g. [1]).

In aggregation theory, it is natural to additionally require that the game
$\nu$ is monotone w.r.t. inclusion and satisfies $\nu(N) = 1$, in which case it
is called a capacity [2]. The resulting aggregation function $C_\nu$ is then
nondecreasing in each variable and coincides with a weighted arithmetic mean on
each of the $n$-dimensional polyhedra defined by (1). Furthermore, in this
case, for any $T \subseteq N$, the coefficient $\nu(T)$ can be naturally
interpreted as the weight or the importance of the subset $T$ of attributes [4].

The Choquet integral w.r.t. a capacity satisfies very appealing properties for
aggregation. For instance, it lies between the minimum and the maximum, is
stable under the same transformations of interval scales in the sense of the
theory of measurement, and coincides with a weighted arithmetic mean whenever
the capacity is additive. An axiomatic characterization is provided in [4].
Moreover, the Choquet integral w.r.t. a capacity includes weighted arithmetic
means, ordered weighted averaging functions [6], and lattice polynomial
functions as particular cases [7,1].



3 Distributional relationships with linear combinations of order 
statistics 



In the present section, we investigate the moments and the distribution of the 
Choquet integral when considered as a function of n continuous i.i.d. random 
variables. Our main theoretical results, stated in the following proposition 
and its corollary, yield expressions of the moments and the distribution of 
the Choquet integral in terms of the moments and the distribution of linear 
combinations of order statistics. 

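Let $X_1, \dots, X_n$ be a random sample drawn from a continuous c.d.f.
$F : \mathbb{R} \to \mathbb{R}$ with associated p.d.f.
$f : \mathbb{R} \to \mathbb{R}$, and let $X_{1:n} \le \dots \le X_{n:n}$ denote
the corresponding order statistics. Furthermore, let

$$Y_\nu := C_\nu(X_1, \dots, X_n),$$

$$Y_\nu^\sigma := \sum_{i=1}^n p_i^{\nu,\sigma} \, X_{n-i+1:n}, \qquad \sigma \in \mathfrak{S}_n.$$

Let also $F_\nu(y)$ and $F_\nu^\sigma(y)$ be the c.d.f.s of $Y_\nu$ and
$Y_\nu^\sigma$, respectively. Finally, let $h : \mathbb{R} \to \mathbb{R}$ be
any measurable function.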

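Proposition 2 For any game $\nu$ on $N$, we have

$$E[h(Y_\nu)] = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} E[h(Y_\nu^\sigma)].$$

Proof. By definition, we have

$$E[h(Y_\nu)] = \int_{\mathbb{R}^n} h\big(C_\nu(x)\big) \prod_{i=1}^n f(x_i) \, dx
= \sum_{\sigma \in \mathfrak{S}_n} \int_{R_\sigma} h\Big(\sum_{i=1}^n p_i^{\nu,\sigma} x_{\sigma(i)}\Big) \prod_{i=1}^n f(x_i) \, dx.$$

Using the well-known fact (see e.g. [15, §2.2]) that the joint p.d.f. of
$X_{1:n} \le \dots \le X_{n:n}$ is

$$n! \prod_{i=1}^n f(x_i), \qquad x_1 \le \dots \le x_n,$$

we obtain

$$E[h(Y_\nu)] = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} E\Big[h\Big(\sum_{i=1}^n p_i^{\nu,\sigma} X_{n-i+1:n}\Big)\Big],$$

which completes the proof. □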
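As a small worked illustration of Proposition 2 (added here, with $h$ the identity and $n = 2$), the two permutations of $N = \{1, 2\}$ give $Y_\nu^{\mathrm{id}} = \nu(\{1\}) X_{2:2} + (\nu(N) - \nu(\{1\})) X_{1:2}$ and $Y_\nu^{(2,1)} = \nu(\{2\}) X_{2:2} + (\nu(N) - \nu(\{2\})) X_{1:2}$, so that

$$E[Y_\nu] = \frac{1}{2}\Big(E[Y_\nu^{\mathrm{id}}] + E[Y_\nu^{(2,1)}]\Big)
= \frac{\nu(\{1\}) + \nu(\{2\})}{2} \, E[X_{2:2}] + \Big(\nu(N) - \frac{\nu(\{1\}) + \nu(\{2\})}{2}\Big) E[X_{1:2}].$$

For instance, in the standard uniform case, where $E[X_{1:2}] = 1/3$ and $E[X_{2:2}] = 2/3$, this reduces to $E[Y_\nu] = \big(\nu(\{1\}) + \nu(\{2\})\big)/6 + \nu(N)/3$.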



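Before going through the main corollary, recall that the plus (resp. minus)
truncated power function $x_+^n$ (resp. $x_-^n$) is defined to be $x^n$ if
$x > 0$ (resp. $x < 0$) and zero otherwise.

Corollary 3 For any game $\nu$ on $N$, we have

$$F_\nu(y) = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} F_\nu^\sigma(y).$$

Proof. Define $h_y(x) := (x - y)_-^0$. Then, from Proposition 2, for any
$y \in \mathbb{R}$, we have

$$F_\nu(y) = E[h_y(Y_\nu)] = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} E[h_y(Y_\nu^\sigma)] = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} F_\nu^\sigma(y).$$

□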



The results stated in Proposition 2 and Corollary 3 are not very surprising.
From Definition 1, it is clear that the Choquet integral is a linear combination
of order statistics whose coefficients depend on the ordering of the arguments.
The different possible orderings merely lead to a division of the integration
domain $\mathbb{R}^n$ into the subdomains $R_\sigma$
($\sigma \in \mathfrak{S}_n$) defined in (1), and the difficult part still lies
in the evaluation of the moments and the distribution of linear
combinations of order statistics.

The relationship for the raw moments is obtained by considering the special
case $h(x) = x^r$, which may still lead to tedious computations. From
Proposition 2, we obtain

$$E[Y_\nu] = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \sum_{k=1}^n p_k^{\nu,\sigma} \, E[X_{n-k+1:n}],$$

and more generally,

$$E[Y_\nu^r] = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \sum_{k_1=1}^n \cdots \sum_{k_r=1}^n \Big(\prod_{i=1}^r p_{k_i}^{\nu,\sigma}\Big) E\Big[\prod_{i=1}^r X_{n-k_i+1:n}\Big].$$

Unfortunately, this latter formula involves a huge number of terms, namely
$n! \, n^r$. The following result (see [1, Prop. 3] for the uniform case) yields
the $r$th raw moment as a sum of $(r + 1)^n$ terms, each of which is a product
of coefficients $\nu(T_i)$.

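Proposition 4 For any integer $r \ge 1$ and any game $\nu$ on $N$, setting
$T_{r+1} := N$ and $X_{0:n} := 0$, we have

$$E[Y_\nu^r] = \sum_{T_1 \subseteq \dots \subseteq T_r \subseteq N} \frac{r!}{[T]_0! \cdots [T]_n!} \Bigg(\prod_{i=1}^r \frac{\nu(T_i)}{\binom{|T_{i+1}|}{|T_i|}}\Bigg) E\Bigg[\prod_{i=1}^r \big(X_{n-|T_i|+1:n} - X_{n-|T_i|:n}\big)\Bigg],$$

where $[T]_j$ represents the number of "$j$" among $|T_1|, \dots, |T_r|$.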



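Proof. Fix $\sigma \in \mathfrak{S}_n$. Rewriting $Y_\nu^\sigma$ as

$$Y_\nu^\sigma = \sum_{i=1}^n \nu_i^\sigma \, (X_{n-i+1:n} - X_{n-i:n}),$$

and then using the multinomial theorem, we obtain

$$(Y_\nu^\sigma)^r = \sum_{\substack{r_1, \dots, r_n \ge 0 \\ r_1 + \dots + r_n = r}} \frac{r!}{r_1! \cdots r_n!} \prod_{i=1}^n (\nu_i^\sigma)^{r_i} (X_{n-i+1:n} - X_{n-i:n})^{r_i}$$

$$= \sum_{1 \le i_1 \le \dots \le i_r \le n} \frac{r!}{[i]_0! \cdots [i]_n!} \prod_{k=1}^r \nu_{i_k}^\sigma \, (X_{n-i_k+1:n} - X_{n-i_k:n}),$$

where $[i]_j$ represents the number of "$j$" among $i_1, \dots, i_r$. Now, using
Proposition 2 with $h(x) = x^r$, we immediately obtain

$$E[Y_\nu^r] = \sum_{1 \le i_1 \le \dots \le i_r \le n} \frac{r!}{[i]_0! \cdots [i]_n!} \, E\Big[\prod_{k=1}^r (X_{n-i_k+1:n} - X_{n-i_k:n})\Big] \, \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \prod_{k=1}^r \nu_{i_k}^\sigma.$$

The final result then follows from the identity (see the proof of [1, Prop. 3])

$$\frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \prod_{k=1}^r \nu_{i_k}^\sigma = \sum_{\substack{T_1 \subseteq \dots \subseteq T_r \subseteq N \\ |T_1| = i_1, \dots, |T_r| = i_r}} \prod_{i=1}^r \frac{\nu(T_i)}{\binom{|T_{i+1}|}{|T_i|}}.$$

□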



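For example, the first two raw moments are

$$E[Y_\nu] = \sum_{T \subseteq N} \frac{\nu(T)}{\binom{n}{|T|}} \, E[X_{n-|T|+1:n} - X_{n-|T|:n}] \qquad (2)$$

and

$$E[Y_\nu^2] = \sum_{T_1 \subseteq T_2 \subseteq N} \frac{2}{[T]_0! \cdots [T]_n!} \, \frac{\nu(T_1)\nu(T_2)}{\binom{|T_2|}{|T_1|}\binom{n}{|T_2|}} \, E\big[(X_{n-|T_1|+1:n} - X_{n-|T_1|:n})(X_{n-|T_2|+1:n} - X_{n-|T_2|:n})\big],$$

that is,

$$E[Y_\nu^2] = 2 \sum_{T_1 \subsetneq T_2 \subseteq N} \frac{\nu(T_1)\nu(T_2)}{\binom{|T_2|}{|T_1|}\binom{n}{|T_2|}} \, E\big[(X_{n-|T_1|+1:n} - X_{n-|T_1|:n})(X_{n-|T_2|+1:n} - X_{n-|T_2|:n})\big]
+ \sum_{T \subseteq N} \frac{\nu(T)^2}{\binom{n}{|T|}} \, E\big[(X_{n-|T|+1:n} - X_{n-|T|:n})^2\big]. \qquad (3)$$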
TCN 



4 The uniform case



In this section, we focus on the moments and the distribution of $Y_\nu$ when
the random sample $X_1, \dots, X_n$ is drawn from the standard uniform
distribution. To emphasize this last point, as classically done, we shall denote
the random sample as $U_1, \dots, U_n$ and the corresponding order statistics by
$U_{1:n} \le \dots \le U_{n:n}$.

Before detailing the results obtained in [1] and providing algorithms for com- 
puting the p.d.f. and the c.d.f. of the Choquet integral, we recall some basic 
material related to divided differences (see e.g. [16-18] for further details). 



4.1 Divided differences 



Let $\mathcal{A}^{(n)}$ be the set of $n - 1$ times differentiable one-place
functions $g$ such that $g^{(n-1)}$ is absolutely continuous. The $n$th divided
difference of a function $g \in \mathcal{A}^{(n)}$ is the symmetric function of
$n + 1$ arguments defined inductively by $\Delta[g : a_0] := g(a_0)$ and

$$\Delta[g : a_0, \dots, a_n] :=
\begin{cases}
\dfrac{\Delta[g : a_1, \dots, a_n] - \Delta[g : a_0, \dots, a_{n-1}]}{a_n - a_0}, & \text{if } a_0 \ne a_n, \\[2ex]
\dfrac{\partial}{\partial a_0} \Delta[g : a_0, \dots, a_{n-1}], & \text{if } a_0 = a_n.
\end{cases}$$



The Peano representation of the divided differences is given by

$$\Delta[g : a_0, \dots, a_n] = \frac{1}{n!} \int_{\mathbb{R}} g^{(n)}(t) \, M(t \mid a_0, \dots, a_n) \, dt,$$

where $M(t \mid a_0, \dots, a_n)$ is the B-spline of order $n$, with knots
$\{a_0, \dots, a_n\}$, defined as

$$M(t \mid a_0, \dots, a_n) := n \, \Delta[(\cdot - t)_+^{n-1} : a_0, \dots, a_n]. \qquad (4)$$



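We also recall the Hermite-Genocchi formula: For any function
$g \in \mathcal{A}^{(n)}$, we have

$$\Delta[g : a_0, \dots, a_n] = \int_{R_{\mathrm{id}} \cap [0,1]^n} g^{(n)}\Big(a_0 + \sum_{i=1}^n (a_i - a_{i-1}) \, x_i\Big) \, dx, \qquad (5)$$

where $R_{\mathrm{id}}$ is the region defined in (1) when $\sigma$ is the
identity permutation.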

For distinct arguments $a_0, \dots, a_n$, we also have the following formula,
which can be verified by induction.

$$\Delta[g : a_0, \dots, a_n] = \sum_{i=0}^n \frac{g(a_i)}{\prod_{j \ne i} (a_i - a_j)}. \qquad (6)$$
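For distinct arguments, the inductive definition and the explicit formula (6) can be cross-checked numerically. The sketch below is only an illustration; the function names are chosen here and the confluent case (repeated arguments, which requires derivatives) is deliberately not handled.

```python
from math import prod


def divided_difference(g, nodes):
    """n-th divided difference of g at distinct nodes a_0, ..., a_n,
    computed with the inductive definition (triangular table)."""
    n = len(nodes) - 1
    table = [g(a) for a in nodes]                 # order-0 differences
    for order in range(1, n + 1):
        # table[i] becomes Delta[g : a_i, ..., a_{i+order}]
        table = [(table[i + 1] - table[i]) / (nodes[i + order] - nodes[i])
                 for i in range(n - order + 1)]
    return table[0]


def divided_difference_explicit(g, nodes):
    """Same quantity through the explicit formula (6)."""
    return sum(g(a) / prod(a - b for j, b in enumerate(nodes) if j != i)
               for i, a in enumerate(nodes))


# The n-th divided difference of a degree-n monic polynomial equals 1.
nodes = [0.0, 1.0, 2.0, 3.0]
cube = lambda x: x ** 3
by_recursion = divided_difference(cube, nodes)
by_formula = divided_difference_explicit(cube, nodes)
```

The recursion costs O(n^2) operations, as does formula (6); the former is usually preferred numerically when some arguments are close to each other.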



4.2 Moments and distribution 



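Let $g \in \mathcal{A}^{(n)}$. From (5), we immediately have that

$$E\Big[g^{(n)}\Big(\sum_{i=1}^n p_i^{\nu,\sigma} \, U_{n-i+1:n}\Big)\Big] = n! \, \Delta[g : \nu_0^\sigma, \dots, \nu_n^\sigma], \qquad (7)$$

since the joint p.d.f. of $U_{n:n} \ge \dots \ge U_{1:n}$ is $n!$ on
$R_{\mathrm{id}} \cap [0, 1]^n$ and zero elsewhere.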



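Now, combining (7) with Proposition 2, we obtain

$$E[g^{(n)}(Y_\nu)] = \sum_{\sigma \in \mathfrak{S}_n} \Delta[g : \nu_0^\sigma, \dots, \nu_n^\sigma]. \qquad (8)$$

Eq. (8) provides the expectation $E[g^{(n)}(Y_\nu)]$ in terms of the divided
differences of $g$ with arguments $\nu_0^\sigma, \dots, \nu_n^\sigma$
($\sigma \in \mathfrak{S}_n$). An explicit formula can be obtained by (6)
whenever the arguments are distinct for every $\sigma \in \mathfrak{S}_n$.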

Clearly, the special cases

$$g(x) = \frac{r!}{(n + r)!} \, x^{n+r}, \qquad g(x) = \frac{r!}{(n + r)!} \, (x - E[Y_\nu])^{n+r}, \qquad \text{and} \qquad g(x) = \frac{e^{tx}}{t^n}$$

give, respectively, the raw moments, the central moments, and the
moment-generating function of $Y_\nu$. As far as the raw moments are concerned,
we have the following result [1, Prop. 3], which is a special case of
Proposition 4.

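Proposition 5 For any integer $r \ge 1$ and any game $\nu$ on $N$, setting
$T_{r+1} := N$, we have

$$E[Y_\nu^r] = \frac{1}{\binom{n+r}{r}} \sum_{T_1 \subseteq \dots \subseteq T_r \subseteq N} \prod_{i=1}^r \frac{\nu(T_i)}{\binom{|T_{i+1}|}{|T_i|}}.$$

Proposition 5 provides an explicit expression for the $r$th raw moment of
$Y_\nu$ as a sum of $(r + 1)^n$ terms. For instance, the first two moments are

$$E[Y_\nu] = \frac{1}{n + 1} \sum_{T \subseteq N} \frac{\nu(T)}{\binom{n}{|T|}}, \qquad (9)$$

$$E[Y_\nu^2] = \frac{2}{(n + 1)(n + 2)} \sum_{T_1 \subseteq T_2 \subseteq N} \frac{\nu(T_1)\nu(T_2)}{\binom{|T_2|}{|T_1|}\binom{n}{|T_2|}}. \qquad (10)$$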



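By using (8) with $g(x) = \frac{1}{n!}(x - y)_-^n$, we also obtain the c.d.f.
$F_\nu(y)$ of $Y_\nu$ [1].

Theorem 6 There holds

$$F_\nu(y) = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \Delta[(\cdot - y)_-^n : \nu_0^\sigma, \dots, \nu_n^\sigma] = 1 - \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \Delta[(\cdot - y)_+^n : \nu_0^\sigma, \dots, \nu_n^\sigma]. \qquad (11)$$

It follows from (11) that the p.d.f. of $Y_\nu$ is simply given by

$$f_\nu(y) = \frac{1}{(n - 1)!} \sum_{\sigma \in \mathfrak{S}_n} \Delta[(\cdot - y)_+^{n-1} : \nu_0^\sigma, \dots, \nu_n^\sigma], \qquad (12)$$

or, using the B-spline notation (4), by

$$f_\nu(y) = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} M(y \mid \nu_0^\sigma, \dots, \nu_n^\sigma).$$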

Remark:

(i) When the arguments $\nu_0^\sigma, \dots, \nu_n^\sigma$ are distinct for every
$\sigma \in \mathfrak{S}_n$, then combining (6) with (11) immediately yields
the following explicit expressions

$$F_\nu(y) = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \sum_{i=0}^n \frac{(\nu_i^\sigma - y)_-^n}{\prod_{j \ne i} (\nu_i^\sigma - \nu_j^\sigma)} = 1 - \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \sum_{i=0}^n \frac{(\nu_i^\sigma - y)_+^n}{\prod_{j \ne i} (\nu_i^\sigma - \nu_j^\sigma)}.$$

(ii) The case of linear combinations of order statistics, called ordered
weighted averaging operators in aggregation theory (see e.g. [6]), is of
particular interest. In this case, each $\nu_i^\sigma$ is independent of
$\sigma$, so that we can write $\nu_i := \nu_i^\sigma$. The main formulas then
reduce to (see e.g. [19,20])

$$E[g^{(n)}(Y_\nu)] = n! \, \Delta[g : \nu_0, \dots, \nu_n],$$

$$F_\nu(y) = \Delta[(\cdot - y)_-^n : \nu_0, \dots, \nu_n],$$

$$f_\nu(y) = M(y \mid \nu_0, \dots, \nu_n).$$

Note also that the Hermite-Genocchi formula (5) provides nice geometric
interpretations of $F_\nu(y)$ and $f_\nu(y)$ in terms of volumes of slices and
sections of canonical simplices (see also [21,22]).



4.3 Algorithms



Both the functions $F_\nu$ and $f_\nu$ require the computation of divided
differences of truncated power functions. On this issue, we recall a recurrence
equation, due to de Boor [23] and rediscovered independently by Varsi [24] (see
also [21]), which allows one to compute
$\Delta[(\cdot - y)_+^{n-1} : a_0, \dots, a_n]$ in $O(n^2)$ operations.

Rename as $b_1, \dots, b_r$ the elements $a_i$ such that $a_i < y$ and as
$c_1, \dots, c_s$ the elements such that $a_i \ge y$, so that $r + s = n + 1$.
Then, the unique solution of the recurrence equation

$$\alpha_{k,l} = \frac{(c_l - y) \, \alpha_{k-1,l} + (y - b_k) \, \alpha_{k,l-1}}{c_l - b_k}, \qquad k \le r, \ l \le s,$$

with initial values $\alpha_{1,1} = (c_1 - b_1)^{-1}$ and
$\alpha_{0,l} = \alpha_{k,0} = 0$ for all $l, k \ge 2$, is given by

$$\alpha_{k,l} := \Delta[(\cdot - y)_+^{k+l-2} : b_1, \dots, b_k, c_1, \dots, c_l], \qquad k + l \ge 2.$$

In order to compute
$\Delta[(\cdot - y)_+^{n-1} : a_0, \dots, a_n] = \alpha_{r,s}$, it suffices
therefore to compute the sequence $\alpha_{k,l}$ for $k + l \ge 2$, $k \le r$,
$l \le s$, by means of two nested loops, one on $k$, the other on $l$. We detail
this computation in Algorithm 1 (see also [21,24]).

Algorithm 1 Algorithm for the computation of $\Delta[(\cdot - y)_+^{n-1} : a_0, \dots, a_n]$.
Require: $n$, $a_0, \dots, a_n$, $y$
  $S \leftarrow 0$, $R \leftarrow 0$
  for $i = 0, 1, \dots, n$ do
    if $a_i - y \ge 0$ then
      $S \leftarrow S + 1$
      $C_S \leftarrow a_i - y$
    else
      $R \leftarrow R + 1$
      $B_R \leftarrow a_i - y$
    end if
  end for
  $A_0 \leftarrow 0$, $A_1 \leftarrow 1/(C_1 - B_1)$ {Initialization of the unidimensional temporary
  array of size $S + 1$ necessary for the computation of the divided difference}
  for $j = 2, \dots, S$ do
    $A_j \leftarrow -B_1 A_{j-1}/(C_j - B_1)$
  end for
  for $i = 2, \dots, R$ do
    for $j = 1, \dots, S$ do
      $A_j \leftarrow (C_j A_j - B_i A_{j-1})/(C_j - B_i)$
    end for
  end for
  return $A_S$ {Contains the value of $\Delta[(\cdot - y)_+^{n-1} : a_0, \dots, a_n]$.}



We can compute $\Delta[(\cdot - y)_-^n : a_0, \dots, a_n]$ similarly. Indeed,
the same recurrence equation applied to the initial values $\alpha_{0,l} = 0$
for all $l \ge 1$ and $\alpha_{k,0} = 1$ for all $k \ge 1$, produces the
solution

$$\alpha_{k,l} := \Delta[(\cdot - y)_-^{k+l-1} : b_1, \dots, b_k, c_1, \dots, c_l], \qquad k + l \ge 1.$$
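Algorithm 1 and its variant for the minus truncated power can be transcribed as follows. This is a sketch under stated assumptions rather than the kappalab code: the function names are chosen here, and the early return for the case where all arguments lie on the same side of `y` is an addition, since Algorithm 1 implicitly assumes at least one argument on each side. The explicit formula (6) is used as an independent check for distinct arguments.

```python
from math import prod


def dd_plus(a, y):
    """Delta[(. - y)_+^{n-1} : a_0, ..., a_n] following Algorithm 1 (n = len(a) - 1)."""
    b = [ai - y for ai in a if ai - y < 0]     # shifted arguments below y
    c = [ai - y for ai in a if ai - y >= 0]    # shifted arguments at or above y
    R, S = len(b), len(c)
    if R == 0 or S == 0:
        # Added guard: on the arguments the function is then a polynomial of
        # degree < n (or identically zero), so its n-th divided difference is 0.
        return 0.0
    A = [0.0] * (S + 1)                        # temporary array A_0, ..., A_S
    A[1] = 1.0 / (c[0] - b[0])
    for j in range(2, S + 1):
        A[j] = -b[0] * A[j - 1] / (c[j - 1] - b[0])
    for i in range(2, R + 1):
        for j in range(1, S + 1):
            A[j] = (c[j - 1] * A[j] - b[i - 1] * A[j - 1]) / (c[j - 1] - b[i - 1])
    return A[S]


def dd_minus(a, y):
    """Delta[(. - y)_-^n : a_0, ..., a_n]: same recurrence, other initial values."""
    b = [ai - y for ai in a if ai - y < 0]
    c = [ai - y for ai in a if ai - y >= 0]
    A = [1.0] + [0.0] * len(c)                 # alpha_{k,0} = 1, alpha_{0,l} = 0
    for bk in b:
        for l in range(1, len(c) + 1):
            A[l] = (c[l - 1] * A[l] - bk * A[l - 1]) / (c[l - 1] - bk)
    return A[len(c)]


def dd_explicit(g, a):
    """Formula (6), valid for distinct arguments only."""
    return sum(g(ai) / prod(ai - aj for j, aj in enumerate(a) if j != i)
               for i, ai in enumerate(a))


args, y0 = [0.0, 0.1, 0.7, 1.0], 0.4           # illustrative arguments, n = 3
plus_value = dd_plus(args, y0)
minus_value = dd_minus(args, y0)
```

With these functions, the p.d.f. (12) at a point is obtained by summing `dd_plus` over all permutations and dividing by (n - 1)!, and the c.d.f. (11) by summing `dd_minus` and dividing by n!.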

Example 1 The Choquet integral is frequently used in multicriteria decision
aiding, non-additive expected utility theory, or complexity analysis (see for
instance [8] for an overview). For instance, when such an operator is used as an
aggregation function in a given decision making problem, it is very informative
for the decision maker to know its distribution. In that context, one of the
most natural a priori p.d.f.s on $[0,1]^n$ is the standard uniform, which makes
the results presented in this section of particular interest. Let $\nu$ be the
capacity on $N = \{1,2,3\}$ defined by $\nu(\{1\}) = 0.1$, $\nu(\{2\}) = 0.2$,
$\nu(\{3\}) = 0.55$, $\nu(\{1,2\}) = 0.7$, $\nu(\{1,3\}) = 0.8$,
$\nu(\{2,3\}) = 0.6$, and $\nu(\{1,2,3\}) = 1$. The p.d.f. of the Choquet
integral w.r.t. $\nu$, which can be computed through (12) and by means of
Algorithm 1, is represented in Figure 1 (left) by the solid line. The dotted
line represents the p.d.f. estimated by the kernel method from 10 000 randomly
generated realizations of $U_1, U_2, U_3$ using the R statistical system [25].
The expectation and the standard deviation can also be calculated through (9)
and (10). We have

$$E[Y_\nu] \approx 0.495 \qquad \text{and} \qquad \sqrt{E[Y_\nu^2] - E[Y_\nu]^2} \approx 0.183.$$

The sample mean and the sample standard deviation of the above mentioned
10 000 realizations of the Choquet integral are

$$\bar{y}_\nu \approx 0.497 \qquad \text{and} \qquad s_{y_\nu} \approx 0.183.$$

Fig. 1. P.d.f.s of the discrete Choquet integral (solid lines) in the standard
uniform (left panel) and standard exponential (right panel) cases. The dotted
lines represent the corresponding p.d.f.s estimated by the kernel method from
10 000 randomly generated realizations.
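The computations of Example 1 can be reproduced along the following lines. This is a sketch, not the kappalab implementation: the capacity is stored as a dict keyed by `frozenset` (an assumption made here), the moments follow (9) and (10), and the p.d.f. uses (12) together with the explicit formula (6), which is legitimate for this capacity because the arguments are distinct for every permutation. The numerical integration at the end is only a sanity check that the density has unit mass.

```python
from itertools import combinations, permutations
from math import comb, factorial, prod, sqrt

# Capacity of Example 1, with the values as printed there.
NU = {frozenset(t): v for t, v in [
    ((), 0.0), ((1,), 0.1), ((2,), 0.2), ((3,), 0.55),
    ((1, 2), 0.7), ((1, 3), 0.8), ((2, 3), 0.6), ((1, 2, 3), 1.0)]}
N = 3


def subsets(items):
    items = sorted(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def mean_uniform(nu, n):
    """First raw moment, Eq. (9)."""
    return sum(nu[t] / comb(n, len(t)) for t in subsets(range(1, n + 1))) / (n + 1)


def second_moment_uniform(nu, n):
    """Second raw moment, Eq. (10): sum over chains T1 within T2 within N."""
    total = 0.0
    for t2 in subsets(range(1, n + 1)):
        for t1 in subsets(t2):
            total += nu[t1] * nu[t2] / (comb(len(t2), len(t1)) * comb(n, len(t2)))
    return 2.0 * total / ((n + 1) * (n + 2))


def pdf_uniform(nu, n, y):
    """Density (12) evaluated with formula (6); needs distinct arguments."""
    total = 0.0
    for sigma in permutations(range(1, n + 1)):
        args = [nu[frozenset(sigma[:i])] for i in range(n + 1)]   # nu_0^s, ..., nu_n^s
        for i, a in enumerate(args):
            if a > y:                                             # (a - y)_+^{n-1}
                total += (a - y) ** (n - 1) / prod(
                    a - b for j, b in enumerate(args) if j != i)
    return total / factorial(n - 1)


mean = mean_uniform(NU, N)
sd = sqrt(second_moment_uniform(NU, N) - mean ** 2)
# Midpoint rule on [0, 1]: the density should integrate to one.
m = 2000
mass = sum(pdf_uniform(NU, N, (k + 0.5) / m) for k in range(m)) / m
```

Printing `mean` and `sd` allows a comparison with the values reported in Example 1. For capacities with repeated arguments, formula (6) breaks down and the recurrence of Algorithm 1 should be used instead.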



5 The standard exponential case 

In the standard exponential case, i.e., when $F(x) = 1 - e^{-x}$, $x \ge 0$, the
exact distribution of the Choquet integral can be obtained if the numbers
$\{\nu_i^\sigma\}_{i \in N, \sigma \in \mathfrak{S}_n}$ satisfy certain
regularity conditions. The result is based on the following proposition (see
[15, §6.5] and the references therein).

Proposition 7 Let $a_1, \dots, a_n \in \mathbb{R}$ and let $X_1, \dots, X_n$ be
a random sample drawn from the standard exponential distribution. For any
$i \in N$, define

$$c_i := \frac{1}{n - i + 1} \sum_{j=i}^n a_j.$$

Then, if $c_i \ne c_k$ whenever $i \ne k$, and $c_i > 0$ for all $i \in N$, the
p.d.f. of $T = \sum_{i=1}^n a_i X_{i:n}$ is given by

$$h_T(t) = \sum_{i=1}^n \frac{c_i^{n-2}}{\prod_{k \ne i} (c_i - c_k)} \, e^{-t/c_i}, \qquad t \ge 0.$$

The p.d.f. $f_\nu(y)$ of the Choquet integral then results from Corollary 3 and
Proposition 7.
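
Corollary 8 Assume that, for any $\sigma \in \mathfrak{S}_n$,
$\nu_i^\sigma / i \ne \nu_k^\sigma / k$ whenever $i \ne k$, and that
$\nu_i^\sigma / i > 0$ for all $i \in N$. Then

$$f_\nu(y) = \frac{1}{n!} \sum_{\sigma \in \mathfrak{S}_n} \sum_{i=1}^n \frac{(\nu_i^\sigma / i)^{n-2}}{\prod_{k \ne i} (\nu_i^\sigma / i - \nu_k^\sigma / k)} \, e^{-i y / \nu_i^\sigma}, \qquad y \ge 0. \qquad (13)$$

Proof. The result is a direct consequence of Corollary 3, Proposition 7, and
the fact that, for any $\sigma \in \mathfrak{S}_n$,

$$\frac{1}{n - i + 1} \sum_{j=i}^n p_{n-j+1}^{\nu,\sigma} = \frac{\nu_{n-i+1}^\sigma}{n - i + 1}, \qquad i \in N.$$

□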
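The density of Corollary 8 is a finite sum of exponentials and is cheap to evaluate. The sketch below assumes the dict-of-`frozenset` representation of the capacity and uses the capacity of Example 1, for which the ratios $\nu_i^\sigma / i$ are distinct and positive for every permutation; the function name and the integration grid are choices made here. The midpoint-rule integrals only serve as a plausibility check.

```python
from itertools import permutations
from math import exp, factorial, prod

# Capacity of Example 1, with the values as printed there.
NU = {frozenset(t): v for t, v in [
    ((), 0.0), ((1,), 0.1), ((2,), 0.2), ((3,), 0.55),
    ((1, 2), 0.7), ((1, 3), 0.8), ((2, 3), 0.6), ((1, 2, 3), 1.0)]}
N = 3


def pdf_exponential(nu, n, y):
    """Density of the Choquet integral of a standard exponential sample.

    For each permutation sigma the integral is distributed as sum_i c_i Z_i,
    with Z_i i.i.d. standard exponential and c_i = nu_i^sigma / i; the
    densities are then averaged over sigma (Corollary 3).
    """
    if y < 0:
        return 0.0
    total = 0.0
    for sigma in permutations(range(1, n + 1)):
        c = [nu[frozenset(sigma[:i])] / i for i in range(1, n + 1)]
        for i, ci in enumerate(c):
            denom = prod(ci - ck for k, ck in enumerate(c) if k != i)
            total += ci ** (n - 2) / denom * exp(-y / ci)
    return total / factorial(n)


# Midpoint rule on [0, 20]; the tail beyond 20 is negligible for this capacity.
h = 0.005
grid = [(k + 0.5) * h for k in range(4000)]
mass = h * sum(pdf_exponential(NU, N, y) for y in grid)
mean = h * sum(y * pdf_exponential(NU, N, y) for y in grid)
```

When two ratios $\nu_i^\sigma / i$ are close to each other the individual terms become large with opposite signs, so some loss of accuracy should be expected in floating-point arithmetic.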



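The first two moments of the order statistics in the standard exponential case
are given (see e.g. [15, p. 52]) by

$$E[X_{i:n}] = \sum_{k=n-i+1}^n \frac{1}{k}, \qquad (14)$$

and, if $i < j$,

$$E[X_{i:n} X_{j:n}] = \sum_{k=n-i+1}^n \frac{1}{k^2} + E[X_{i:n}] \, E[X_{j:n}] \qquad (15)$$

(the same expression with $j = i$ gives $E[X_{i:n}^2]$).

Used in combination with (2) and (3), these expressions enable us to obtain
the first two raw moments of the Choquet integral.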
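The combination of (14) and (15) with (2) and (3) can be sketched as follows. The helper names and the dict-of-`frozenset` representation of the capacity are assumptions made for this illustration; empty subsets are skipped because $\nu(\emptyset) = 0$.

```python
from itertools import combinations
from math import comb, sqrt

# Capacity of Example 1, with the values as printed there.
NU = {frozenset(t): v for t, v in [
    ((), 0.0), ((1,), 0.1), ((2,), 0.2), ((3,), 0.55),
    ((1, 2), 0.7), ((1, 3), 0.8), ((2, 3), 0.6), ((1, 2, 3), 1.0)]}
N = 3


def mean_os(i, n):
    """E[X_{i:n}] from (14); returns 0 for i = 0 (convention X_{0:n} := 0)."""
    return sum(1.0 / k for k in range(n - i + 1, n + 1))


def prod_os(i, j, n):
    """E[X_{i:n} X_{j:n}] from (15); i == j gives the second moment."""
    i, j = min(i, j), max(i, j)
    if i == 0:
        return 0.0
    return (sum(1.0 / k ** 2 for k in range(n - i + 1, n + 1))
            + mean_os(i, n) * mean_os(j, n))


def spacing_mean(t, n):
    """E[X_{n-t+1:n} - X_{n-t:n}]."""
    return mean_os(n - t + 1, n) - mean_os(n - t, n)


def spacing_prod(s, t, n):
    """E[(X_{n-s+1:n} - X_{n-s:n})(X_{n-t+1:n} - X_{n-t:n})]."""
    return (prod_os(n - s + 1, n - t + 1, n) - prod_os(n - s + 1, n - t, n)
            - prod_os(n - s, n - t + 1, n) + prod_os(n - s, n - t, n))


def nonempty_subsets(items):
    items = sorted(items)
    return [frozenset(c) for k in range(1, len(items) + 1)
            for c in combinations(items, k)]


def first_two_moments(nu, n):
    """Raw moments (2) and (3) of the Choquet integral."""
    full = range(1, n + 1)
    m1 = sum(nu[t] / comb(n, len(t)) * spacing_mean(len(t), n)
             for t in nonempty_subsets(full))
    m2 = 0.0
    for t2 in nonempty_subsets(full):
        for t1 in nonempty_subsets(t2):
            factor = 1.0 if len(t1) == len(t2) else 2.0   # multinomial coefficient
            m2 += (factor * nu[t1] * nu[t2]
                   / (comb(len(t2), len(t1)) * comb(n, len(t2)))
                   * spacing_prod(len(t1), len(t2), n))
    return m1, m2


m1, m2 = first_two_moments(NU, N)
sd = sqrt(m2 - m1 ** 2)
```

Replacing `mean_os` and `prod_os` by the moments of another distribution (exact or approximate) gives the corresponding moments of the Choquet integral without changing `first_two_moments`.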

Example 2 Consider again the capacity given in Example 1 and assume now
that $X_1, X_2, X_3$ is a random sample from the standard exponential
distribution. The p.d.f. of the Choquet integral w.r.t. $\nu$, which can be
computed by means of (13), is represented in Figure 1 (right) by the solid line.
The dotted line represents the p.d.f. estimated by the kernel method from
10 000 randomly generated realizations.

Combining (14) and (15) with (2) and (3), we obtain the following values:

$$E[Y_\nu] \approx 0.963 \qquad \text{and} \qquad \sqrt{E[Y_\nu^2] - E[Y_\nu]^2} \approx 0.624.$$

The sample mean and the sample standard deviation of the above mentioned
10 000 realizations of the Choquet integral are

$$\bar{y}_\nu \approx 0.964 \qquad \text{and} \qquad s_{y_\nu} \approx 0.630.$$

6 Approximations of the moments



When $F$ is neither the standard uniform nor the standard exponential c.d.f.,
but $F^{-1}$ and its derivatives can be easily computed, one can obtain
approximations of the moments of order statistics, and therefore of those of
the Choquet integral, using the approach initially proposed by David and
Johnson [26].

Let $U_1, \dots, U_n$ be a random sample from the standard uniform distribution.
The product moments of the corresponding order statistics are then given by
the following formula:

$$E\Big[\prod_{j=1}^l U_{i_j:n}^{m_j}\Big] = \frac{n!}{\big(n + \sum_{j=1}^l m_j\big)!} \prod_{j=1}^l \frac{(i_j + m_1 + \dots + m_j - 1)!}{(i_j + m_1 + \dots + m_{j-1} - 1)!}, \qquad (16)$$

where $1 \le i_1 < \dots < i_l \le n$. Now, it is well known that the c.d.f. of
$X_{i:n}$ is given by

$$\Pr[X_{i:n} \le x] = \sum_{j=i}^n \binom{n}{j} F^j(x) [1 - F(x)]^{n-j}.$$

It immediately follows that

$$\Pr[F^{-1}(U_{i:n}) \le x] = \Pr[U_{i:n} \le F(x)] = \Pr[X_{i:n} \le x],$$

i.e., that $F^{-1}(U_{i:n})$ and $X_{i:n}$ are equal in distribution.
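The product-moment formula (16) for uniform order statistics is easy to evaluate exactly with rational arithmetic. The following sketch uses a function name and argument layout chosen here for illustration; it returns a `Fraction`, so the familiar special cases can be checked without rounding.

```python
from fractions import Fraction
from math import factorial


def uniform_product_moment(n, ranks, powers):
    """E[prod_j U_{i_j:n}^{m_j}] from (16).

    ranks  : strictly increasing ranks i_1 < ... < i_l in {1, ..., n}
    powers : corresponding nonnegative integer exponents m_1, ..., m_l
    """
    if any(a >= b for a, b in zip(ranks, ranks[1:])):
        raise ValueError("ranks must be strictly increasing")
    value = Fraction(factorial(n), factorial(n + sum(powers)))
    shift = 0                                   # m_1 + ... + m_{j-1}
    for i, m in zip(ranks, powers):
        value *= Fraction(factorial(i + shift + m - 1), factorial(i + shift - 1))
        shift += m
    return value


# Special cases for n = 5: E[U_{2:5}], E[U_{2:5} U_{4:5}] and E[U_{2:5}^2].
first = uniform_product_moment(5, [2], [1])
cross = uniform_product_moment(5, [2, 4], [1, 1])
square = uniform_product_moment(5, [2], [2])
```

These values agree with the classical expressions $i/(n+1)$, $i(j+1)/((n+1)(n+2))$ and $i(i+1)/((n+1)(n+2))$.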

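Starting from this distributional equality, David and Johnson [26] expanded
$F^{-1}(U_{i:n})$ in a Taylor series around the point $E[U_{i:n}] = i/(n + 1)$ in
order to obtain approximations of product moments of non-uniform order
statistics. Setting $r_i := i/(n + 1)$, $G := F^{-1}$, $G_i := G(r_i)$,
$G_i^{(1)} := G^{(1)}(r_i)$, etc., we have

$$X_{i:n} \overset{d}{=} G_i + (U_{i:n} - r_i) \, G_i^{(1)} + \frac{1}{2} (U_{i:n} - r_i)^2 \, G_i^{(2)} + \frac{1}{6} (U_{i:n} - r_i)^3 \, G_i^{(3)} + \cdots$$

Setting $s_i := 1 - r_i$, taking the expectation of the previous expression and
using (16), the following approximation for the expectation of $X_{i:n}$ can be
obtained to order $(n + 2)^{-2}$:

$$E[X_{i:n}] \approx G_i + \frac{r_i s_i}{2(n + 2)} \, G_i^{(2)} + \frac{r_i s_i}{(n + 2)^2} \Big[\frac{1}{3} (s_i - r_i) \, G_i^{(3)} + \frac{1}{8} r_i s_i \, G_i^{(4)}\Big]. \qquad (17)$$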
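
Similarly, for the first product moment, we have

$$
\begin{aligned}
E[X_{i:n} X_{j:n}] \approx{} & G_i G_j + \frac{r_i s_j}{n + 2} \, G_i^{(1)} G_j^{(1)} + \frac{r_i s_i}{2(n + 2)} \, G_i^{(2)} G_j + \frac{r_j s_j}{2(n + 2)} \, G_i G_j^{(2)} \\
& + \frac{r_i s_j}{(n + 2)^2} \Big[(s_i - r_i) \, G_i^{(2)} G_j^{(1)} + (s_j - r_j) \, G_i^{(1)} G_j^{(2)} + \frac{1}{2} r_i s_i \, G_i^{(3)} G_j^{(1)} + \frac{1}{2} r_j s_j \, G_i^{(1)} G_j^{(3)} + \frac{1}{2} r_i s_j \, G_i^{(2)} G_j^{(2)}\Big] \\
& + \frac{r_i s_i \, G_j}{(n + 2)^2} \Big[\frac{1}{3} (s_i - r_i) \, G_i^{(3)} + \frac{1}{8} r_i s_i \, G_i^{(4)}\Big] + \frac{r_j s_j \, G_i}{(n + 2)^2} \Big[\frac{1}{3} (s_j - r_j) \, G_j^{(3)} + \frac{1}{8} r_j s_j \, G_j^{(4)}\Big] \\
& + \frac{r_i r_j s_i s_j}{4(n + 2)^2} \, G_i^{(2)} G_j^{(2)}, \qquad (18)
\end{aligned}
$$

where $i < j$.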



The accuracy of the above approximations is discussed in [15, §4.6]. Note 
that Childs and Balakrishnan [27] have recently proposed MAPLE routines 
facilitating the computations and permitting the inclusion of higher order 
terms. 

As already mentioned, the previous expressions are useful only if $G := F^{-1}$
and its derivatives can be easily computed. This is the case for instance when
$F$ is the standard normal c.d.f. Indeed, there exist algorithms that enable an
accurate computation of $G$, and it can be verified (see e.g. [15, p. 85]) that

$$G^{(1)} = \frac{1}{f \circ G}, \qquad G^{(2)} = \frac{G}{f^2 \circ G}, \qquad G^{(3)} = \frac{1 + 2G^2}{f^3 \circ G}, \qquad \text{and} \qquad G^{(4)} = \frac{G(7 + 6G^2)}{f^4 \circ G},$$

where $f := F^{(1)}$.


From a practical perspective, in order to obtain a better accuracy for
$E[X_{i:n}]$ and $E[X_{i:n} X_{j:n}]$ in the standard normal case, one can use
the expressions obtained to order $(n + 2)^{-3}$ in [26] and recalled in [27].
We do not reproduce these expressions here as they are very long. We provide
however the expressions of $G^{(5)}$ and $G^{(6)}$ required for computing them:

$$G^{(5)} = \frac{7 + G^2(46 + 24G^2)}{f^5 \circ G} \qquad \text{and} \qquad G^{(6)} = \frac{G(127 + 326G^2 + 120G^4)}{f^6 \circ G}.$$
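For the standard normal distribution, approximation (17) only needs the quantile function and the density. The sketch below relies on `statistics.NormalDist` from the Python standard library for both; the function name is chosen here, and only the order $(n+2)^{-2}$ expansion is implemented. The exact value $E[X_{3:3}] = 3/(2\sqrt{\pi})$ for a standard normal sample of size 3 is used as a reference point.

```python
from math import pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()      # mean 0, standard deviation 1


def dj_mean_normal(i, n):
    """David-Johnson approximation (17) of E[X_{i:n}], standard normal case."""
    r = i / (n + 1)
    s = 1.0 - r
    g = STD_NORMAL.inv_cdf(r)              # G(r_i)
    q = 1.0 / STD_NORMAL.pdf(g)            # 1 / f(G(r_i)), i.e. G^{(1)}
    g2 = g * q ** 2                        # G^{(2)}
    g3 = (1.0 + 2.0 * g * g) * q ** 3      # G^{(3)}
    g4 = g * (7.0 + 6.0 * g * g) * q ** 4  # G^{(4)}
    return (g + r * s / (2.0 * (n + 2)) * g2
            + r * s / (n + 2) ** 2 * ((s - r) * g3 / 3.0 + r * s * g4 / 8.0))


approx_max = dj_mean_normal(3, 3)          # approximate E[X_{3:3}]
exact_max = 3.0 / (2.0 * sqrt(pi))         # exact E[X_{3:3}]
approx_median = dj_mean_normal(2, 3)       # should vanish by symmetry
```

For such a small sample size the truncation error is visible in the third decimal, which is consistent with the remark above that higher-order terms improve accuracy.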



Example 3 Consider again the capacity given in Example 1 and assume now
that the decision maker wants the standard normal as a priori p.d.f.
Combining (17) and (18) with (2) and (3), we obtain the following approximate
values:

$$E[Y_\nu] \approx -0.014 \qquad \text{and} \qquad \sqrt{E[Y_\nu^2] - E[Y_\nu]^2} \approx 0.615.$$

For comparison, the sample mean and the sample standard deviation of 10 000
independent realizations of the corresponding Choquet integral are

$$\bar{y}_\nu \approx -0.013 \qquad \text{and} \qquad s_{y_\nu} \approx 0.620.$$

7 Asymptotic distribution of the Choquet integral



Conditions under which a linear combination of order statistics is
asymptotically normal have been extensively studied in the statistical
literature. A good synthesis on the subject is given in [15, §11.4]. Provided
some regularity conditions are satisfied, typically on $\nu$ and $F$ in the
context under consideration, the existing theoretical results, combined with
Proposition 2, practically imply that, for large $n$, $Y_\nu$ is approximately
distributed as a mixture of $n!$ normals
$\mathcal{N}(E[Y_\nu^\sigma], V[Y_\nu^\sigma])$, $\sigma \in \mathfrak{S}_n$,
each weighted by $1/n!$.

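From a practical perspective, the most useful result seems to be that of
Stigler [28]. For any $\sigma \in \mathfrak{S}_n$, let $J^{\nu,\sigma}$ be a real
function on $[0,1]$ such that $J^{\nu,\sigma}(i/n) = n \, p_{n-i+1}^{\nu,\sigma}$.
Then, $Y_\nu^\sigma$ can be rewritten as

$$Y_{\nu,n}^\sigma = \frac{1}{n} \sum_{i=1}^n J^{\nu,\sigma}\Big(\frac{i}{n}\Big) X_{i:n},$$

where the subscript $n$ in $Y_{\nu,n}^\sigma$ is added to emphasize dependence
on the sample size. Furthermore, let

$$\alpha(J^{\nu,\sigma}, F) := \int_{-\infty}^{\infty} x \, J^{\nu,\sigma}(F(x)) \, dF(x) = \int_0^1 J^{\nu,\sigma}(u) \, F^{-1}(u) \, du,$$

and

$$\beta^2(J^{\nu,\sigma}, F) := 2 \iint_{-\infty < x < y < +\infty} J^{\nu,\sigma}(F(x)) \, J^{\nu,\sigma}(F(y)) \, F(x) (1 - F(y)) \, dx \, dy$$

$$= 2 \iint_{0 < u < v < 1} J^{\nu,\sigma}(u) \, J^{\nu,\sigma}(v) \, u (1 - v) \, dF^{-1}(u) \, dF^{-1}(v).$$

Then, Stigler's results [28, Theorems 2 and 3] (see also [15, Theorem 11.4])
state that, if $F$ has a finite variance and if $J^{\nu,\sigma}$ is bounded and
continuous almost everywhere w.r.t. $F^{-1}$, one has

$$\lim_{n \to \infty} E[Y_{\nu,n}^\sigma] = \alpha(J^{\nu,\sigma}, F), \qquad \lim_{n \to \infty} n \, V[Y_{\nu,n}^\sigma] = \beta^2(J^{\nu,\sigma}, F),$$

and, if additionally $\beta^2(J^{\nu,\sigma}, F) > 0$,

$$\frac{Y_{\nu,n}^\sigma - E[Y_{\nu,n}^\sigma]}{\sqrt{V[Y_{\nu,n}^\sigma]}} \xrightarrow{d} \mathcal{N}(0, 1) \qquad \text{as } n \to \infty.$$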



Example 4 To illustrate the applicability of these results, consider the
following game $\nu$ on $N$ defined by

$$\nu(T) := \frac{1}{n} \sum_{i=1}^{|T|} \Big(\frac{n - i + 1}{n}\Big)^a, \qquad T \subseteq N,$$

where $a$ is a strictly positive real number. We then have

$$p_i^{\nu,\sigma} = \frac{1}{n} \Big(\frac{n - i + 1}{n}\Big)^a, \qquad \forall i \in N, \ \forall \sigma \in \mathfrak{S}_n.$$

As the coefficients $p_i^{\nu,\sigma}$ do not depend on $\sigma$, the
corresponding Choquet integral is merely a linear combination of order
statistics. Note however that the game $\nu$ is by no means additive. Next,
define $J^{\nu,\sigma}(x) := x^a$, for all $x \in [0, 1]$. Then, clearly,
$J^{\nu,\sigma}(i/n) = n \, p_{n-i+1}^{\nu,\sigma}$ for all $i \in N$.

In order to simplify the calculations, assume furthermore that $F$ is the
standard uniform c.d.f. and that $a = 2$. Then, $J^{\nu,\sigma}$ is clearly
bounded and continuous almost everywhere w.r.t. $F^{-1}$ and we have
$\alpha(J^{\nu,\sigma}, F) = 1/4$ and $\beta^2(J^{\nu,\sigma}, F) = 1/112$.

The dotted lines in Figure 2 represent the p.d.f. of the Choquet integral w.r.t.
$\nu$ estimated by the kernel method from 10 000 randomly generated
realizations for $n = 3, 5, 10$ and $20$. The solid lines represent the normal
p.d.f.s $\mathcal{N}(E[Y_{\nu,n}^\sigma], V[Y_{\nu,n}^\sigma])$, where
$E[Y_{\nu,n}^\sigma]$ and $V[Y_{\nu,n}^\sigma]$ are computed by means of (9)
and (10).

Fig. 2. Approximations of the p.d.f.s of discrete Choquet integral by mixtures
of normals (solid lines) for $n = 3, 5, 10$ and $20$. The dotted lines represent
the corresponding p.d.f.s estimated by the kernel method from 10 000 randomly
generated realizations.
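The limits stated by Stigler's results can be observed numerically on Example 4. The sketch below computes the exact mean and the exact variance of the linear combination of uniform order statistics with weight function $J(u) = u^a$, using the classical moments $E[U_{i:n}] = i/(n+1)$ and $\mathrm{Cov}(U_{i:n}, U_{j:n}) = i(n+1-j)/((n+1)^2(n+2))$ for $i \le j$; the function name and the choice $n = 200$ are illustrative.

```python
def example4_moments(n, a=2.0):
    """Exact mean and n * variance of (1/n) * sum_i J(i/n) U_{i:n}, J(u) = u**a."""
    w = [(i / n) ** a / n for i in range(1, n + 1)]      # weight of U_{i:n}
    mean = sum(w[i - 1] * i / (n + 1) for i in range(1, n + 1))
    var = 0.0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            lo, hi = min(i, j), max(i, j)
            var += w[i - 1] * w[j - 1] * lo * (n + 1 - hi)
    var /= (n + 1) ** 2 * (n + 2)
    return mean, n * var


mean_200, nvar_200 = example4_moments(200)
alpha, beta2 = 1.0 / 4.0, 1.0 / 112.0                    # limits for a = 2
```

For $n = 200$ both quantities are already within a few percent of $\alpha$ and $\beta^2$, whereas for the small values of $n$ typical of decision problems the gap is substantial.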

From the previous example, it clearly appears that one strong prerequisite
before being able to apply the previous theoretical results is the knowledge
of the expression of the game $\nu$ in terms of $n$. In practical applications
of aggregation operators, this is rarely the case as $\nu$ is usually
determined for some fixed $n$ from learning data (see e.g. [29]). It follows
that in such situations the above theoretical conditions cannot be rigorously
verified.

Fig. 3. Approximations of the p.d.f.s of discrete Choquet integrals by mixtures
of normals (solid lines) in the standard normal, standard uniform and standard
exponential cases (panels from left to right). The dotted lines represent the
corresponding p.d.f.s estimated by the kernel method from 10 000 randomly
generated realizations.

In informal terms, Stigler [28] states that a linear combination of order
statistics is likely to be asymptotically normally distributed if the extremal
order statistics do not contribute "too much", which is satisfied if the
weights are "smooth" and "bounded". When dealing with a Choquet integral,
several numerical indices could be computed to assess whether the operator
behaves in a too conjunctive (minimum-like) or too disjunctive (maximum-like)
way. One such index is the degree of orness studied in [9,10].

Example 5 Consider again the capacity given in Example 1. The degree of
orness of this capacity, computed using the kappalab R package, is 0.49, which
indicates a fairly neutral (slightly conjunctive) behavior. The solid lines in
Figure 3 represent the mixtures of $3! = 6$ normals in the standard normal,
standard uniform and standard exponential cases as possible approximations
of the p.d.f. of the corresponding Choquet integral. As previously, the dotted
lines represent the p.d.f.s estimated by the kernel method from 10 000
randomly generated realizations. As one can see, the approximation is very good
in the standard normal case, may be considered as acceptable in the standard
uniform case, and poor in the exponential case. Provided that using such an
approximation is valid (which, as discussed above, cannot be verified), one
could argue that the poor results in the exponential case are due to the too
low value of $n$ ($= 3$). Although such low values for $n$ make no sense in
statistics, in multicriteria decision aiding for instance, they are quite
common. In fact, in practical decision problems involving aggregation
operators, the value of $n$ is very rarely greater than 10.
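The mixture of normals used as an approximation in the standard uniform case can be assembled as follows. This is a sketch under assumptions made here (dict-of-`frozenset` capacity, illustrative function names); the component means and variances are the exact moments of $Y_\nu^\sigma$ for uniform order statistics. By Proposition 2, the first two raw moments of the mixture coincide with those of the Choquet integral itself, which gives a simple consistency check.

```python
from itertools import permutations
from math import exp, factorial, pi, sqrt

# Capacity of Example 1, with the values as printed there.
NU = {frozenset(t): v for t, v in [
    ((), 0.0), ((1,), 0.1), ((2,), 0.2), ((3,), 0.55),
    ((1, 2), 0.7), ((1, 3), 0.8), ((2, 3), 0.6), ((1, 2, 3), 1.0)]}
N = 3


def component_moments_uniform(nu, sigma):
    """Mean and variance of Y_nu^sigma = sum_i p_i U_{n-i+1:n} (uniform sample)."""
    n = len(sigma)
    levels = [nu[frozenset(sigma[:i])] for i in range(n + 1)]    # nu_i^sigma
    # weight[k] multiplies U_{k:n}; p_i is attached to rank k = n - i + 1
    weight = {n - i + 1: levels[i] - levels[i - 1] for i in range(1, n + 1)}
    mean = sum(weight[k] * k / (n + 1) for k in weight)
    var = sum(weight[k] * weight[l] * min(k, l) * (n + 1 - max(k, l))
              for k in weight for l in weight) / ((n + 1) ** 2 * (n + 2))
    return mean, var


def mixture_pdf(nu, n, y):
    """Equal-weight mixture of the n! normal densities N(E[Y^s], V[Y^s])."""
    total = 0.0
    for sigma in permutations(range(1, n + 1)):
        m, v = component_moments_uniform(nu, sigma)
        total += exp(-(y - m) ** 2 / (2.0 * v)) / sqrt(2.0 * pi * v)
    return total / factorial(n)


components = [component_moments_uniform(NU, s)
              for s in permutations(range(1, N + 1))]
mix_mean = sum(m for m, _ in components) / len(components)
mix_second = sum(v + m * m for m, v in components) / len(components)
density_at_half = mixture_pdf(NU, N, 0.5)
```

The same construction applies to the exponential and normal cases once `component_moments_uniform` is replaced by the corresponding moments of linear combinations of order statistics.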